ABSTRACT

A method for performing color or grayscale image compression that eliminates redundant and invisible image components. The image compression uses a Discrete Cosine Transform (DCT) and each DCT coefficient yielded by the transform is quantized by an entry in a quantization matrix which determines the perceived image quality and the bit rate of the image being compressed. The present invention adapts or customizes the quantization matrix to the image being compressed. The quantization matrix comprises visual masking by luminance and contrast techniques and by an error pooling technique, all resulting in a minimum perceptual error for any given bit rate, or minimum bit rate for a given perceptual error.

10 Claims, 7 Drawing Sheets 
(1 of 7 Drawing(s) in Color) 
IMAGE DATA COMPRESSION HAVING MINIMUM PERCEPTUAL ERROR

ORIGIN OF THE DISCLOSURE

The invention described herein was made by an employee of the National Aeronautics and Space Administration and it may be manufactured and used by and for the United States Government for governmental purposes without the payment of royalties thereon or therefor.

BACKGROUND OF THE INVENTION

A. Technical Field of the Invention:

The present invention relates to an apparatus and method for coding images, and more particularly, to an apparatus and method for compressing images to a reduced number of bits by employing a Discrete Cosine Transform (DCT) in combination with visual masking including luminance and contrast techniques as well as error pooling techniques, all to yield a quantization matrix optimizer that provides an image having a minimum perceptual error for a given bit rate, or a minimum bit rate for a given perceptual error.

B. Description of the Prior Art:

Considerable research has been conducted in the field of data compression, especially the compression of digital information of digital images. Digital images comprise a rapidly growing segment of the digital information stored and communicated by science, commerce, industry and government. Digital image transmission has gained significant importance in highly advanced television systems, such as high definition television using digital information. Because a relatively large number of digital bits are required to represent digital images, a difficult burden is placed on the infrastructure of the computer communication networks involved with the creation, transmission and re-creation of digital images. For this reason, there is a need to compress digital images to a smaller number of bits, by reducing redundancy and invisible image components of the images themselves.

A system that performs image compression is disclosed in U.S. Pat. No. 5,121,216 of C. E. Chen et al., issued Jun. 9, 1992, and herein incorporated by reference. The '216 patent discloses a transform coding algorithm for a still image, wherein the image is divided into small blocks of pixels. For example, each block of pixels may be either an 8x8 or 16x16 block. Each block of pixels then undergoes a two dimensional transform to produce a two dimensional array of transform coefficients. For still image coding applications, a Discrete Cosine Transform (DCT) is utilized to provide the orthogonal transform.

In addition to the '216 patent, the Discrete Cosine Transform is also employed in a number of current and future international standards concerned with digital image compression, commonly referred to as JPEG and MPEG, which are acronyms for Joint Photographic Experts Group and Moving Pictures Experts Group, respectively. After a block of pixels of the '216 patent undergoes a Discrete Cosine Transform (DCT), the resulting transform coefficients are subject to compression by thresholding and quantization operations. Thresholding involves setting all coefficients whose magnitude is smaller than a threshold value equal to zero, whereas quantization involves scaling a coefficient by step size and rounding off to the nearest integer.

Commonly, the quantization of each DCT coefficient is determined by an entry in a quantization matrix. It is this matrix that is primarily responsible for the perceived image quality and the bit rate of the transmission of the image. The perceived image quality is important because the human visual system can tolerate a certain amount of degradation of an image without being alerted to a noticeable error. Therefore, certain images can be transmitted at a low bit rate, whereas other images cannot tolerate degradation and should be transmitted at a higher bit rate in order to preserve their informational content.

The '216 patent discloses a method for the compression of image information based on human visual sensitivity to quantization errors. In the method of the '216 patent, there is a quantization characteristic associated with block to block components of an image. This quantization characteristic is based on a busyness measurement of the image. The method of the '216 patent does not compute a complete quantization matrix, but rather only a single scalar quantizer.

Two other methods are available for computing DCT quantization matrices based on human sensitivity. One is based on a mathematical formula for the human contrast sensitivity function, scaled for viewing distance and display resolution, and is disclosed in U.S. Pat. No. 4,780,761 of S. J. Daly et al. The second is based on a formula for the visibility of individual DCT basis functions, as a function of viewing distance, display resolution, and display luminance. The second formula is disclosed in a first technical article entitled "Luminance-Model-Based DCT Quantization For Color Image Compression" of A. J. Ahumada and H. A. Peterson, published in 1992 in Human Vision, Visual Processing and Digital Display III, Proc. SPIE 1666, Paper 32, and a second technical article entitled "An Improved Detection Model for DCT Coefficient Quantization" of H. A. Peterson, et al., published in 1993 in Human Vision, Visual Processing and Digital Display VI, Proc. SPIE Vol. 1913, pages 191-201, and a third technical article entitled "A visual detection model for DCT coefficient quantization" of A. J. Ahumada, Jr. and H. A. Peterson, published in 1993 in Computing in Aerospace 9, American Institute of Aeronautics and Astronautics, pages 314-318.

The methods described in the '761 patent and the three technical articles do not adapt the quantization matrix to the image being compressed, and do not therefore take advantage of masking techniques for quantization errors that utilize the image itself. Each of these techniques has features and benefits described below.

First, visual thresholds increase with background luminance and this feature should be advantageously utilized. However, the formula given in both referenced technical articles describes the threshold for DCT basis functions as a function of mean luminance. This would normally be taken as the mean luminance of the display. However, variations in local mean luminance within the image will in fact produce substantial variations in the DCT threshold quantities. These variations are referred to herein as "luminance masking" and should be fully taken into account.

Second, the visibility of a visual pattern is typically reduced in the presence of other patterns, particularly those of similar spatial frequency and orientation. This reduction phenomenon is usually called "contrast masking." This means that a threshold error in a particular DCT coefficient in a particular block of the image will be a function of the value of that coefficient in the original image. The knowledge of this function should be taken advantage of in order to compress the image while not reducing the quality of the compressed image.

Third, the method disclosed in the two referenced technical articles ensures that a single error is below a predetermined threshold. However, in a typical image there are many errors of varying magnitudes that are not properly handled by a single threshold quantity. The visibility of this error ensemble is not generally equal to the visibility of the largest error, but rather reflects a pooling of errors over both frequencies and blocks of the image. This pooling is herein termed "error pooling" and is beneficial in compressing the digital information of the image while not degrading the quality of the image.

Fourth, when all errors are kept below a perceptual threshold, a certain bit rate will result, but at times it may be desired to have an even lower bit rate. The two referenced technical articles do not disclose any method that would yield a minimum perceptual error for a given bit rate, or a minimum bit rate for a given perceptual error. It is desired that such a method be provided to accommodate this need.

Fifth, since color images comprise a great proportion of [illegible in source]. The referenced technical articles provide a method for computing three quantization matrices for the three color channels of a color image, but do not disclose any method for optimizing the matrix for a particular color image.

Finally, it is desired that all of the above prior art limitations and drawbacks be eliminated so that a digital image may be represented by a reduced number of digital bits while at the same time providing an image having a low perceptual error.

Accordingly, an object of the present invention is to provide a method to compress digital information yet provide a visually optimized image.

Another object of the present invention is to provide a method of compressing a visual image based on luminance masking, contrast masking, and error pooling techniques.

A further object of the present invention is to provide a quantization matrix that is adapted to the individual image being compressed so that either the grayscale or the color image that is reproduced has a minimal perceptual error for a given bit rate, or a minimum bit rate for a given perceptual error.

SUMMARY OF THE INVENTION

The invention is directed to digital compression of color or grayscale images, comprising a plurality of color channels and a plurality of blocks of pixels, that uses the DCT transform coefficients yielded from a Discrete Cosine Transform (DCT) of all the blocks as well as other display and perceptual parameters, all to generate quantization matrices which, in turn, yield a reproduced image having a low perceptual error for a given bit rate. The invention adapts or customizes the quantization matrices to the image being compressed.

The present invention transforms a digital grayscale or color image into a compressed digital representation of that image and comprises the steps of transforming each color pixel, if necessary, into brightness and color channel values, down-sampling, if necessary, each color channel, partitioning each color channel into square blocks of contiguous pixels, applying a Discrete Cosine Transform (DCT) to each block in each color channel, selecting a DCT mask ($m_{u,v,b,\theta}$) for each block of pixels in each color channel, and selecting a quantization matrix ($q_{u,v,\theta}$) for quantizing DCT transformation coefficients ($c_{u,v,b,\theta}$) produced by the DCT transformation. The application of a Discrete Cosine Transform (DCT) transforms the block of pixels into a digital signal represented by the DCT coefficients ($c_{u,v,b,\theta}$). The DCT mask is based on parameters comprising DCT coefficients ($c_{u,v,b,\theta}$), and display and perceptual parameters. The selection of the quantization matrix ($q_{u,v,\theta}$) comprises the steps of: (i) selecting an initial value of $q_{u,v,\theta}$; (ii) quantizing the DCT coefficient $c_{u,v,b,\theta}$ in each block b to form quantized coefficient $k_{u,v,b,\theta}$; (iii) inverse quantizing $k_{u,v,b,\theta}$ by multiplying by $q_{u,v,\theta}$; (iv) subtracting the reconstructed coefficient $q_{u,v,\theta} k_{u,v,b,\theta}$ from $c_{u,v,b,\theta}$ to compute the quantization error $e_{u,v,b,\theta}$; (v) dividing $e_{u,v,b,\theta}$ by the DCT mask $m_{u,v,b,\theta}$ to obtain perceptual error; (vi) pooling the perceptual errors of one frequency u,v and color $\theta$ over all blocks b to obtain an entry in the perceptual error matrices $p_{u,v,\theta}$; (vii) repeating (i-vi) for each frequency u,v and color $\theta$; (viii) adjusting the values of $q_{u,v,\theta}$ up or down until each entry in the perceptual error matrices $p_{u,v,\theta}$ is within a target range, and entropy coding the quantization matrices and the quantized coefficients of the image.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a computer network that may be used in the practice of the present invention.

FIG. 2 schematically illustrates some of the steps involved with the method of the present invention.

FIG. 3 schematically illustrates the steps involved, in one embodiment, with the formation of the quantization matrix optimizer of the present invention.

FIG. 4 schematically illustrates the steps involved, in another embodiment, with the formation of the quantization matrix optimizer of the present invention.

FIG. 5 illustrates a series of plots showing the variations [illegible in source] of the present invention.

FIG. 6 is a plot of the contrast masking function involved in the practice of the present invention.

FIG. 7 illustrates two plots, each of a different digital image and each showing the relationship between the perceptual error and the bit rate involved in image compression of the present invention.

FIG. 8 is composed of photos A and B respectively illustrating an image compressed with and without the present invention at equal bit rates.

DETAILED DESCRIPTION OF THE INVENTION

Referring now to the drawings wherein like reference numerals designate like elements, there is shown in FIG. 1 a block diagram of a computer network 10 that may be used in the practice of the present invention. The network 10 is particularly suited for performing the method of the present invention related to images that may be stored, retrieved or transmitted. For the embodiment shown in FIG. 1, a first group of channelized equipment 12 and a second group of channelized equipment 14 are provided. Further, as will be further described, for the embodiment shown in FIG. 1, the channelized equipment 12 is used to perform the storage mode 16/retrieval mode 18 operations of the network 10 and, similarly, the channelized equipment 14 is used to perform the storage mode 16/retrieval mode 18 operations of the network 10. As will be further described, the storage mode 16 is shown as accessing each disk subsystem 20, whereas the retrieval mode 18 is shown as recovering information from each disk subsystem 20. Each of the channelized equipments 12 and 14 may be a SUN SPARC computer station whose operation is disclosed in instruction manual Sun Microsystems Part #800-5701-10. Each of the channelized equipments 12 and 14 is comprised of elements having the reference numbers given in Table 1.

TABLE 1

Reference No.    Element
20               Disk Subsystem
22               Communication Channel
24               CPU Processor
26               Random Access Memory (RAM)
28               Display Subsystem

In general, and as to be more fully described, the method of the present invention, being run in the network 10, utilizes, in part, a Discrete Cosine Transform (DCT), discussed in the "Background" section, to accomplish image compression. In the storage mode 16, an original image 30, represented by a plurality of digital bits, is received from a scanner or other source at the communication channel 22 of the channelized equipment 12. The image 30 is treated as a digital file containing pixel data. The channelized equipment 12, in particular the CPU processor 24, performs a DCT transformation, computes a DCT mask and iteratively estimates a quantization matrix optimizer. The channelized equipment 12 then quantizes the digital bits comprising the image 30, and performs run-length encoding and Huffman or arithmetic coding of the quantized DCT coefficients. Run-length encoding, arithmetic coding and Huffman coding are well-known and reference may be respectively made to the discussion of reference numbers 24 and 28 of U.S. Pat. No. 5,170,264, herein incorporated by reference, for a further discussion thereof. The optimized quantization matrix is then stored in coded form along with coded coefficient data, following a JPEG or other standard. The compressed file is then stored on the disk subsystem 20 of the channelized equipment 12.

In the retrieval mode 18, the channelized equipment 12 (or 14) retrieves the compressed file from the disk subsystem 20, and decodes the quantization matrix and the DCT coefficient data. The channelized equipment 12, or the channelized equipment 14 to be described, then de-quantizes the coefficients by multiplication of the quantization matrix and performs an inverse DCT. The resulting digital file containing pixel data is available for display on the display subsystem 28 of the channelized equipment 12 or can be transmitted to the channelized equipment 14 or elsewhere by the communication channel 22. The resulting digital file is shown in FIG. 1 as 30' (IMAGE). The operation of the present invention may be further described with reference to FIG. 2.

FIG. 2 is primarily segmented to illustrate the storage mode 16 and the retrieval mode 18. FIG. 2 illustrates that the storage mode 16 is accomplished in channelized equipment, such as channelized equipment 12, and the retrieval mode is accomplished in the same or another channelized equipment, such as channelized equipment 14. The channelized equipments 12 and 14 are interfaced to each other by the communication channel 22. The image 30 being compressed by the operation of the present invention comprises a two-dimensional array of pixels, e.g., 256x256 pixels. In the case of a color image, each pixel is represented by three numbers, such as red, green, and blue components (RGB); in the case of a grayscale image, each pixel is represented by a single number representing the brightness. This array of pixels is composed of contiguous blocks, e.g., 8x8 blocks, of pixels representatively shown in segment 33. The storage mode 16 is segmented into the following steps: color transform 31, down-sample 32, block 33, DCT 34, initial matrices 35, quantization matrix optimizer 36, quantize 38, and entropy code 40. The retrieval mode 18 is segmented into the following steps: entropy decode 42, de-quantize 44, inverse DCT 46, un-block 47, up-sample 48 and inverse color transform 49. The steps shown in FIG. 2 (to be further discussed with reference to FIG. 3) are associated with the image compression of the present invention and, in order to more clearly describe such compression, reference is first made to the quantities listed in Table 2 having a general definition given therein.

TABLE 2

Quantities               Definition
u, v                     indexes of the DCT frequency (or basis function)
b                        index of a block of the image
$\theta$                 index of quantization color space
$c_{u,v,b,\theta}$       DCT coefficients of an image block
$q_{u,v,\theta}$         quantization matrix
$k_{u,v,b,\theta}$       quantized DCT coefficients
$e_{u,v,b,\theta}$       DCT error
$t_{u,v,\theta}$         threshold matrices (based on global mean luminance)
V                        threshold formula of Peterson et al. given in the article "An Improved Detection Model for DCT Coefficient Quantization" (previously cited) or similar formula
$a_{u,v,b,\theta}$       luminance-adjusted threshold matrices
$a_t$                    luminance masking exponent
$w_{u,v,\theta}$         contrast masking exponent (Weber exponent)
$m_{u,v,b,\theta}$       DCT mask (threshold matrices adjusted for local luminance and contrast)
$j_{u,v,b,\theta}$       perceptual error in a particular frequency u, v, block b, color $\theta$
$p_{u,v,\theta}$         perceptual error matrix
$\beta$                  spatial error-pooling exponent
$c_{0,0,b}$              DC coefficient in brightness channel in block b
Y                        mean luminance of the display
$\bar{c}_{0,0}$          average brightness channel DC coefficient (typically 1024)
$\psi$                   target total perceptual error value



Each pixel (step 31) is transformed from the original color representation (such as RGB) to a color representation consisting of one brightness signal and two color signals (such as the well known color space YCbCr). This new color space will be called the quantization color space and its three channels are indexed by $\theta$. If the image is already represented in a space like YCbCr, or if the image is grayscale, then this color transformation step is skipped.

The three component color images (for example Y, Cb, and Cr) are then individually down-sampled by some factors. Down-sampling is well known and may be such as described in the technical article entitled "The JPEG still picture compression standard," by G. Wallace, published in 1991 in Communications of the ACM, volume 34, pages 30-44. Typically, Y is not down-sampled, and Cb and Cr are each down-sampled by a factor of two in both horizontal and vertical dimensions. If the image is grayscale, this down-sampling step is skipped.
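
The color transform (step 31) and down-sampling (step 32) can be illustrated with a minimal sketch. It assumes the JFIF full-range RGB to YCbCr weights and a 2x2 averaging filter for the factor-of-two down-sampling; the text names neither, so both choices, and the function names `rgb_to_ycbcr` and `downsample_2x`, are illustrative only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (JFIF full-range weights, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x(channel):
    """Halve a channel in both dimensions by averaging each 2x2 neighbourhood.

    `channel` is a list of rows; a trailing odd row or column is dropped.
    The averaging filter is an assumption, not taken from the text.
    """
    rows, cols = len(channel), len(channel[0])
    return [
        [
            (channel[i][j] + channel[i][j + 1]
             + channel[i + 1][j] + channel[i + 1][j + 1]) / 4.0
            for j in range(0, cols - 1, 2)
        ]
        for i in range(0, rows - 1, 2)
    ]
```

In the typical arrangement described above, `downsample_2x` would be applied to the Cb and Cr channels only, leaving Y at full resolution.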

Each down-sampled image in each color channel is then partitioned into blocks of contiguous (typically) 8x8 pixels (step 33). Each block of pixels in each color channel is subjected to the application of a Discrete Cosine Transform (DCT) (step 34) yielding related DCT coefficients. The two-dimensional Discrete Cosine Transform (DCT) is well known and may be such as described in the previously incorporated by reference U.S. Pat. No. 5,121,216. The coefficients of the DCT, herein termed $c_{u,v,b,\theta}$, obtained by the Discrete Cosine Transform (DCT) of each block of pixels comprise DC and AC components. The DC coefficient in the brightness channel is herein termed $c_{0,0,b}$, which represents the average brightness of the block; the remainder of the coefficients $c_{u,v,b,\theta}$ are termed AC coefficients.
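
The blocking and transform steps (33 and 34) can be sketched as follows. This is a direct, unoptimized orthonormal two-dimensional DCT-II written for clarity; the orthonormal scaling is an assumption, chosen because it makes the DC coefficient of a mid-gray 8-bit block equal to 8 x 128 = 1024, which matches the nominal average DC value quoted in this description. The names `split_into_blocks` and `dct_2d` are illustrative.

```python
import math


def split_into_blocks(channel, n=8):
    """Partition a channel (list of rows) into contiguous n x n blocks.

    Blocks are returned in row-major order. Both dimensions are assumed
    to be multiples of n; no padding is attempted.
    """
    rows, cols = len(channel), len(channel[0])
    return [
        [row[c0:c0 + n] for row in channel[r0:r0 + n]]
        for r0 in range(0, rows, n)
        for c0 in range(0, cols, n)
    ]


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block, computed by direct summation."""
    n = len(block)

    def alpha(k):
        # Normalisation factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out
```

A production encoder would use a fast separable DCT; the quadruple loop here is only meant to make the definition of the DC and AC coefficients explicit.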

The DCT (step 34) of all blocks (step 33), along with the display and perceptual parameters (to be described) and initial matrices, are all inputted into a quantization matrix optimizer 36, which is a process that creates an optimized quantization matrix which is used to quantize (step 38) the DCT coefficients. The optimized quantization matrix is also transferred, by the communication channel 22 of the channelized equipment 12, for its use in the retrieval mode 18 that is accomplished in the channelized equipment 14. The quantized DCT coefficients ($k_{u,v,b,\theta}$) are entropy coded (step 40) and then sent to the communication channel 22. Entropy coding is well-known in the communication art and is a technique wherein the amount of information in a message is based on $\log n$, where n is the number of possible equivalent messages contained in such information.

At the receiving channelized equipment 14, an inverse process occurs to reconstruct the original block of pixels; thus, the received bit stream of digital information containing quantized DCT coefficients is entropy decoded (step 42) and then de-quantized (step 44), such as by multiplying by the quantization step size $q_{u,v,\theta}$ to be described. An inverse transform (step 46), such as an inverse Discrete Cosine Transform (DCT), is then applied to the DCT coefficients ($c_{u,v,b,\theta}$) to reconstruct the block of pixels. After the reconstruction, the blocks of pixels are unblocked, up-sampled, and inverse color transformed so as to provide a reconstituted and reconstructed image 30'. The quantization matrix optimizer 36 is of particular importance to the present invention and may be described with reference to FIG. 3.

The quantization matrix optimizer 36 is adapted to the particular image being compressed and, as will be further described, advantageously includes the functions of luminance masking, contrast masking, error pooling and selectable quality, all of which functions cooperate to yield a compressed image having a minimal perceptual error for a given bit rate, or minimum bit rate for a given perceptual error. The quantization matrix optimizer 36, in one embodiment, comprises a plurality of processing segments each having a reference number and nomenclature given in Table 3.

TABLE 3

Processing Segment    Nomenclature
50                    Compute visual thresholds
52                    Adjust thresholds in each block for block mean luminance
54                    Adjust thresholds in each block for block component contrast
56                    Quantize
58                    Compute quantization error
60                    Scale error by DCT mask
62                    Pool error over blocks
64                    Pooled error matrix ≈ target error?
66                    Adjust quantization matrices

The first step in the generation of the quantization matrix optimizer 36 is the derivation of a function DCT mask 70 which is accomplished by the operation of processing segments 50, 52 and 54 and is determined, in part, by the display and perceptual parameters 72 having typical values given in the below Table 4.

TABLE 4

Display and Perceptual Parameters    Typical Values
$a_t$                                0.65
$\beta$                              4
$w_{u,v,\theta}$                     0.7
Y                                    40 cd/m²
image grey levels                    256
$\bar{c}_{0,0}$                      1024
$r_Y$                                0.05 (veiling luminance, expressed as ratio of display mean luminance Y)
$p_x$, $p_y$                         32, 32 (these define the number of pixels per degree of visual angle in horizontal and vertical directions; these values correspond to a 256 x 256 pixel image at a viewing distance of 7.125 picture heights)

The display and perceptual parameters 72 are used to compute a matrix of DCT component visual thresholds by using a formula such as that more fully described in the previously referenced first, second, and third technical articles and which formula may be represented by expression 1:

$$t_{u,v,\theta} = V[u, v, \theta, Y, p_x, p_y, \ldots] \qquad (1)$$

where V represents the threshold formula of Table 2, u and v are indexes of the DCT frequency, $\theta$ is the quantization color space, Y is the mean luminance of the display, $p_x$ represents pixels per degree of visual angle horizontal and $p_y$ represents pixels per degree of visual angle vertical.

The visual threshold values of expression (1) are then adjusted for mean block luminance in processing segment 52. The processing segment 52 receives only the DC coefficient of the DCT coefficients indicated by reference number 74, whereas segment 54 receives and uses the entire DCT coefficients. The formula used to accomplish processing segment 52 is given by expression 2:

$$a_{u,v,b,\theta} = t_{u,v,\theta}\left(\frac{r_Y + c_{0,0,b}/\bar{c}_{0,0}}{1 + r_Y}\right)^{a_t} \qquad (2)$$

where $a_t$ is a luminance-masking exponent having a typical value of 0.65, $a_{u,v,b,\theta}$ is the adjusted threshold, $t_{u,v,\theta}$ is the un-adjusted threshold, $\bar{c}_{0,0}$ is the average of the DC terms of the DCT coefficients for the present image, or may be simply a nominal value of 1024, for an eight (8) bit image, and $c_{0,0,b}$ is the DC term of the DCT for block b in the brightness channel. The term $r_Y$ is the veiling luminance (luminance cast on the display by room lights etc.), expressed as ratio of display mean luminance Y. A typical value of $r_Y$ is 0.05.

As seen in FIG. 3, the luminance-adjusted thresholds of segment 52 are then adjusted for component contrast by the operation of a routine having a relationship as given by the below expression 3:

$$m_{u,v,b,\theta} = \max\left[a_{u,v,b,\theta},\ |c_{u,v,b,\theta}|^{w_{u,v,\theta}}\, a_{u,v,b,\theta}^{\,1-w_{u,v,\theta}}\right] \qquad (3)$$

where $m_{u,v,b,\theta}$ is the contrast-adjusted threshold, $c_{u,v,b,\theta}$ is the DCT coefficient, $a_{u,v,b,\theta}$ is the corresponding threshold of expression 2, and $w_{u,v,\theta}$ is the exponent that lies between 0 and 1 and typically has a value of 0.7. Because the exponent $w_{u,v,\theta}$ may differ for different colors and frequencies of the DCT coefficients, a matrix of exponents equal in size to the quantization matrices is provided. The result of the operations of processing segments 50, 52, and 54 is the derivation of the quantity $m_{u,v,b,\theta}$ herein termed "DCT mask" 70 which is supplied to the processing segment 60 to be described hereinafter.
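
The two threshold adjustments that produce the DCT mask (segments 52 and 54) can be sketched for a single coefficient as below. The sketch assumes a power-law luminance adjustment with a veiling-luminance term, a = t * ((r_Y + c00_b / c00_mean) / (1 + r_Y)) ** a_t, and a contrast adjustment of the form m = max(a, |c| ** w * a ** (1 - w)). These forms are reconstructed from the symbol definitions in this description and should be treated as assumptions; the function names are illustrative, and the default values are the typical values quoted in the text.

```python
def luminance_adjusted_threshold(t, c00_block, c00_mean=1024.0,
                                 a_t=0.65, r_y=0.05):
    """Scale a base threshold t by the relative luminance of one block.

    c00_block is the brightness-channel DC coefficient of the block and
    c00_mean the image-wide (or nominal) average DC coefficient.
    The functional form is an assumed reconstruction.
    """
    ratio = (r_y + c00_block / c00_mean) / (1.0 + r_y)
    return t * ratio ** a_t


def dct_mask(c, a, w=0.7):
    """Contrast-adjust a luminance-adjusted threshold a for coefficient c.

    With w = 0 the mask equals the threshold (no contrast masking);
    larger coefficients raise the mask when w > 0.
    """
    return max(a, abs(c) ** w * a ** (1.0 - w))
```

A block at the average brightness leaves the threshold unchanged, a brighter block raises it, and a coefficient smaller in magnitude than its threshold leaves the mask equal to the threshold.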

After the calculation of the DCT mask 70 has been determined, an iterative process of estimating the quantization matrix optimizer 36 begins and is comprised of processing segments 56, 58, 60, 62, 64, and 66. The initial matrices 35, which are typically fixed and which may be any proper quantization matrices, are typically set to a maximum permissible quantization matrix entry (e.g., in the JPEG standard this maximum value is equal to 255) and are used in the quantization of the image as indicated in processing segment 56.

Each transformed block of the image contained in the initial matrix 35 is then quantized in segment 56 by dividing it, coefficient by coefficient, by the quantization matrix ($q_{u,v,\theta}$), and is rounded to the nearest integer as shown in expression 4:

$$k_{u,v,b,\theta} = \mathrm{Round}\left[\frac{c_{u,v,b,\theta}}{q_{u,v,\theta}}\right] \qquad (4)$$

Segment 58 then computes the quantization error $e_{u,v,b,\theta}$ in the DCT domain, which is equal to the difference between the de-quantized and original DCT coefficients $c_{u,v,b,\theta}$ and is shown by expression 5:

$$e_{u,v,b,\theta} = c_{u,v,b,\theta} - k_{u,v,b,\theta}\, q_{u,v,\theta} \qquad (5)$$

From expressions 4 and 5, it may be shown that the maximum possible quantization error $e_{u,v,b,\theta}$ is $q_{u,v,\theta}/2$.

The output of segment 58 is then applied to segment 60, wherein the quantization error is scaled (divided) by the value of the DCT mask 70. This scaling is described by expression 6:

$$j_{u,v,b,\theta} = \frac{e_{u,v,b,\theta}}{m_{u,v,b,\theta}} \qquad (6)$$

where $j_{u,v,b,\theta}$ is defined as the perceptual error at frequency u,v and color $\theta$ in block b. The scaled quantization error is then applied to the processing segment 62. The processing segment 62 causes all the scaled errors to be pooled over all of the blocks, separately for each DCT frequency and color (u,v,$\theta$). The term "error pooling" is meant to represent that the errors are combined over all of the blocks rather than having one relatively large error in one block dominating the other errors in the remaining blocks. The pooling is accomplished by a routine having a relationship of expression 7:

$$p_{u,v,\theta} = \left(\sum_{b} \left|j_{u,v,b,\theta}\right|^{\beta}\right)^{1/\beta} \qquad (7)$$

where $j_{u,v,b,\theta}$ is a perceptual error in a particular frequency u, v, color $\theta$, and block b, and $\beta$ is a pooling exponent having a typical value of 4. It is allowed that the routine of expression (7) provide a matrix of exponents $\beta$ since the pooling of errors may vary for different DCT coefficients. If the three color images have been down-sampled (step 32) by different factors, then the range of the block index b will differ for the three color channels.
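
The quantize, error, scale, and pool segments (56 through 62) can be sketched for one frequency and color as follows. The sketch assumes the usual forms k = round(c / q), e = c - k * q, j = e / m, and p = (sum over blocks of |j| ** beta) ** (1 / beta); the function names and the list-based layout (one coefficient and one mask value per block) are illustrative assumptions.

```python
import math


def quantize(c, q):
    """Quantize coefficient c with step q, rounding halves upward."""
    return math.floor(c / q + 0.5)


def pooled_perceptual_error(coeffs, masks, q, beta=4.0):
    """Pool the mask-scaled quantization errors of one (u, v, color) entry.

    coeffs[b] and masks[b] hold the DCT coefficient and DCT mask value
    of block b for this frequency and color.
    """
    total = 0.0
    for c, m in zip(coeffs, masks):
        k = quantize(c, q)       # quantized coefficient
        e = c - k * q            # quantization error, |e| <= q / 2
        j = e / m                # perceptual error in units of the mask
        total += abs(j) ** beta
    return total ** (1.0 / beta)
```

With beta = 4 the pooled value is dominated by the largest errors but still grows as more blocks carry visible error, which is the behavior the pooling step is meant to capture.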

The matrices $p_{u,v,\theta}$ of expression (7) are the "perceptual error matrices" and are a simple measure of the visibility of artifacts within each of the frequency bands and colors defined by the DCT basis functions. More particularly, the perceptual error matrix is a good indication of whether or not the human eye can perceive a dilution of the image that is being compressed. The perceptual error matrices $p_{u,v,\theta}$ developed by segments 56, 58, 60 and, finally, segment 62 are applied to processing segment 64.

In processing segment 64, each element of the perceptual error matrices $p_{u,v,\theta}$ is compared to a target error parameter $\psi$, which specifies a global perceptual quality of the compressed image. This global quality is somewhat like the entries in the perceptual error matrices and again is a good indication of whether the degradation of the compressed image will be perceived by the human eye. If all quantities or errors generated by segment 62 and entered into segment 64 are within a delta of $\psi$, or if the errors of segment 62 are less than the target error parameter $\psi$ and the corresponding quantization matrix entry is at a maximum (processing segment 56), the search is terminated and the current element of quantization matrix is outputted to comprise an element of the final quantization matrices 78. Otherwise, if the element of the perceptual error matrices is less than the target parameter $\psi$, the corresponding entry (segment 56) of the quantization matrix is incremented. Conversely, if the element of the perceptual error matrix is greater than the target parameter $\psi$, the corresponding entry (segment 56) of the quantization matrix is decremented. The incrementing and decrementing is accomplished by processing segment 66.

A bisection method, performed in segment 66, is typically
used to determine whether to increment or decrement the
initial matrices 55 entered into step 56. In the bisection
method a range is established for $q_{u,v,\theta}$ between lower and
upper bounds, typically 1 and 255 respectively. The perceptual
error matrix $p_{u,v,\theta}$ is evaluated at the mid-point of the
range. If $p_{u,v,\theta}$ is greater than the target error parameter $\psi$,
then the upper bound is reset to the mid-point, so that the
quantization entry is decreased; otherwise the lower bound is
reset to the mid-point. This procedure is
repeated until the mid-point no longer changes. As a practical
matter, since the quantization matrix entries $q_{u,v,\theta}$ in the
baseline JPEG standard are eight bit integers, the needed
degree of accuracy is normally obtained in nine iterations
from a starting range of 1-255 (initial entry into segment
56). The output of the program segment 66 is applied to the
quantize segment 56 and then steps 56-66 are repeated, if
necessary, for the remaining elements in the initial matrices.
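
The search for a single quantization matrix entry can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names `pooled_error` and `bisect_entry` are hypothetical, the DCT masks are taken as given inputs, the upper bound is treated as exclusive so that an entry of 255 can be returned, and the pooled error is assumed to grow roughly monotonically with the quantization entry (rounding makes this only approximately true). When the pooled error exceeds the target the upper bound is lowered, so the entry decreases, in line with the decrement rule described for segment 66.

```python
def pooled_error(q, coeffs, masks, beta=4.0):
    """Pooled perceptual error for one frequency and color at step size q."""
    total = 0.0
    for c, m in zip(coeffs, masks):
        k = round(c / q)         # quantize the DCT coefficient (segment 56)
        e = c - k * q            # error left after de-quantizing by q
        j = e / m                # scale (divide) by the DCT mask (segment 60)
        total += abs(j) ** beta  # accumulate for pooling (segment 62)
    return total ** (1.0 / beta)


def bisect_entry(coeffs, masks, psi, lo=1, hi=256, beta=4.0):
    """Largest integer q in [lo, hi) whose tested pooled error is <= psi.

    Assumes the error at q == lo is acceptable; lo itself is never tested.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if pooled_error(mid, coeffs, masks, beta) > psi:
            hi = mid             # error too visible: search smaller entries
        else:
            lo = mid             # error acceptable: try larger entries
    return lo


# Hypothetical coefficients of one frequency across four blocks.
coeffs = [12.3, -40.7, 5.1, 88.9]
masks = [1.0, 1.0, 1.0, 1.0]
q = bisect_entry(coeffs, masks, psi=1.0)
```

Starting from the range 1 to 255, the loop halves the interval on every pass, so only a handful of evaluations of the pooled error are needed per entry.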

The preceding methods have been described with respect
to a color image, consisting of three color channels, indexed
by $\theta$. If compression of a grayscale image is desired, then
only one color channel exists (brightness, or Y), but otherwise
all operations remain the same.

The processing segments shown in FIG. 3 yield a compressed
image with a resulting bit rate; however, if a
particular bit rate is desired for the image, then the processing
segments shown in FIG. 4 and given in the below Table
5 are applicable.

TABLE 5

Segment   Function

80        Select desired bit rate
82        Set initial target perceptual error
84        Optimize quantization matrix (56, 58, 60, 62, 64 and 66)
86        Quantize
88        Entropy code
90        Decision box (Is the bit rate ≈ desired bit rate?)
92        Adjust target perceptual error
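
The loop tabulated in Table 5 can be sketched in Python as follows. This is only an illustrative sketch: `bit_rate_for` is a hypothetical callable standing in for segments 84, 86 and 88 (optimize, quantize, entropy code), and the fixed step that is halved whenever the direction reverses is an assumed adjustment rule, not one stated in the disclosure. The only behavior taken from the text is that the target perceptual error is raised when the bit rate is too high and lowered when it is too low.

```python
def search_target_error(bit_rate_for, desired, psi=1.0, step=0.5,
                        tol=0.01, max_iter=100):
    """Adjust the target perceptual error psi until the bit rate matches.

    bit_rate_for: hypothetical callable returning the bit rate obtained
                  after optimizing, quantizing and entropy coding at psi.
    desired:      desired bit rate (segment 80).
    psi:          initial target perceptual error (segment 82).
    """
    last_direction = 0
    for _ in range(max_iter):
        rate = bit_rate_for(psi)
        if abs(rate - desired) < tol:        # decision box (segment 90)
            break
        # Bit rate too high: raise psi. Bit rate too low: lower psi.
        direction = 1 if rate > desired else -1
        if last_direction and direction != last_direction:
            step /= 2.0                      # overshoot: take smaller steps
        psi = max(psi + direction * step, 1e-6)
        last_direction = direction
    return psi, rate


# Toy stand-in in which the bit rate falls as the allowed error grows.
psi_found, rate_found = search_target_error(lambda p: 2.0 / p, desired=0.7)
```

The toy rate function is chosen only because it decreases with the target error; any real use would replace it with the full optimization of segment 84 followed by quantization and entropy coding.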

The processing segments 86-92, shown in FIG. 4 and
given in Table 5, allow for the attainment of a particular bit
rate and utilize a second, higher-order optimization in which,
if the first optimization results in a bit rate which is greater
than desired, the value of the target perceptual error parameter
$\psi$ of segment 92 is incremented. Conversely, if a bit rate
results which is lower than desired, the value of the target
perceptual error parameter $\psi$ of segment 92 is decremented.
The sequence of FIG. 4 starts with the selection (segment
80) of the desired bit rate followed by the setting (segment
82) of the initial target perceptual error. The output of
segment 82, as well as the output of segment 92, is applied
to segment 84 which comprises segments 56, 58, 60, 62, 64
and 66, all previously described with reference to FIG. 3 and
all of which contribute to provide an optimized quantization
matrix in a manner also described with reference to FIG. 3.
The output of segment 84 is applied to the quantize segment
86 which operates in a similar manner as described for the
quantize segment 38 of FIG. 2. The output of segment 86 is
applied to the entropy code segment 88 which operates in a
similar manner as described for the entropy code 40 of FIG. 2.

To accomplish the adjustment of the bit rate, the output of
processing segment 88 is applied to a decision segment 90
in which the actual bit rate is compared against the desired
bit rate, and the result of such comparison determines the
described incrementing or decrementing of the target perceptual
error parameter $\psi$. After such incrementing or decrementing,
the processing steps 84-90 are repeated until the
actual bit rate is equal to the desired bit rate, and the final
quantization matrix 78 is created.

It should now be appreciated that the practice of the
present invention provides for a quantization matrix 78 that
yields minimum perceptual error for a given bit rate or a
minimum bit rate for a given perceptual error. The present
invention, as already discussed, provides for visual masking
by luminance and contrast techniques as well as by error
pooling. The luminance masking feature may be further
described with reference to FIG. 5.

FIG. 5 has a y axis given in the log function of $a_{u,v,b,\theta}$ of
expression (2) and an x axis given in block luminance
measured in cd/m². The quantity $a_{u,v,b,\theta}$ shown in FIG. 5 is
based on a maximum display luminance L of 80 cd/m², the
brightness color channel, a veiling luminance ratio $r_Y$ of
0.05, and a grey scale (reference scale for use in black-and-white
television, consisting of several defined levels of
brightness with neutral color) resolution of eight (8) bits.
The curves shown in FIG. 5 are plots for the DCT
coefficient frequencies given in Table 6.

TABLE 6

Detection threshold for a luminance pattern typically
depends upon mean luminance from the local image region.
More particularly, the higher the background of the image
being displayed, the higher the luminance threshold. This is
usually called "light adaptation" but it is called herein
"luminance masking."

FIG. 5 illustrates this effect whereby higher background
luminance yields higher luminance thresholds. The plots
94A-94D illustrate that almost one log unit change in $a_{u,v,b,\theta}$
might be expected to occur within an image, due to variations
in the mean luminance between blocks. The present
invention takes this variation into account, whereas known
prior art techniques fail to consider this wide variation.

In practice, the initial calculation of $t_{u,v,\theta}$ should be made
assuming a selected displayed luminance Y. The parameter
$a_t$ has a typical value of 0.65. It should be noted that
luminance masking may be suppressed by setting $a_t$ equal to
0. More generally, $a_t$ controls the degree to which the
masking of FIG. 4 occurs. It should be further noted that the
power function given in expression (2) makes it easy to
incorporate a non-unity display Gamma, by multiplying $a_t$
by the Gamma exponent having a typical value of 2.3.

As previously discussed with reference to processing
segment 54 of FIG. 3, the present invention also provides for
contrast masking. Contrast masking refers to the reduction
in the visibility of one image component by the presence of
another. This masking is strongest when both components
are of the same spatial frequency, orientation, and location
within the digital image being compressed. Contrast masking
is achieved in the present invention by expression (3) as
previously described. The benefit of the contrast masking
function is illustrated in FIG. 6.

FIG. 6 has a Y axis given in the quantity $m_{u,v,b,\theta}$ (DCT
mask) and an X axis given in the quantity $c_{u,v,b,\theta}$ (DCT
coefficient) and illustrates a response plot 98 of the DCT
mask $m_{u,v,b,\theta}$ as a function of the DCT coefficient $c_{u,v,b,\theta}$ for
the parameter $w_{u,v,\theta}$ = 0.7 and $a_{u,v,b,\theta}$ = 2. Because the effect of
the DC coefficient upon the luminance masking (see
FIG. 5) has already been expressed, the plot 98 does not
include an effect of the DC coefficient $c_{0,0,b,\theta}$ and accomplishes
such by setting the value of $w_{0,0,\theta}$ equal to 0. From
FIG. 6 it may be seen that in this example the DCT mask
($m_{u,v,b,\theta}$) increases by over a factor of three as $c_{u,v,b,\theta}$ varies
between about 2 and 10. This DCT mask ($m_{u,v,b,\theta}$) generated
by processing segment 54 adjusts each block for component
contrast and is used in processing segment 60 to scale
(divide) the quantization error, with both functions of the
DCT mask ensuring good digital compression, while still
providing an image having good visual aspects.

It should be appreciated that the practice of the invention
provides contrast masking so as to provide for a high quality
visual representation of compressed digital images as compared
to other prior art techniques.

The overall operation of the present invention is essentially
illustrated in FIG. 7. FIG. 7 has a Y axis given in
bits/pixel of the compressed digital image and an X axis given
in perceptual error $\psi$. FIG. 7 illustrates two plots 100 and
102 for two different grayscale images that were compressed
and reconstituted in accordance with the hereinbefore
described principles of the present invention. From FIG. 7 it
is seen that the increasing bits/pixel rate causes a decrease in
the perceptual error.

The previously given description yields
desired quantization matrices $q_{u,v,\theta}$ with a specified perceptual
error $\psi$. However, if desired one may have a quantization
matrix $q_{u,v,\theta}$ that uses a given bit rate $h_0$ with a
minimum perceptual error $\psi$. This can be done iteratively by
noting that the bit rate is a decreasing function of the
perceptual error $\psi$, as shown in FIG. 7. In the practice of our
present invention a second order interpolating polynomial is fit
to all previous estimated values of {h, $\psi$} to estimate a next
candidate $\psi$, terminating when $|h - h_0| < \Delta h$, where $\Delta h$ is the
desired accuracy in bit rate. On each iteration a complete
estimation of $q_{u,v,\theta}$ is performed, as shown in FIG. 4.

An illustration of results obtained with the present invention
is shown in FIG. 8. FIG. 8A shows an image compressed
without the benefit of the present invention, using standard
JPEG techniques. The image was 768x512 pixels, and should be
viewed at a distance of 7 picture heights (about 20 inches)
to yield 64 pixels/degree. FIG. 8B shows the same image
compressed to the same bit rate using the present invention.
The visual quality is clearly superior in FIG. 8B, using the
present invention. The limitation of FIG. 8A is especially
noted by the objectionable contouring in the sky.

It should now be appreciated that the practice of the
present invention provides a perceptual error that incorporates
visual masking by luminance and contrast techniques,
and error pooling to estimate the matrix that has a minimum
perceptual error for a given bit rate, or minimal bit rate for
a given perceptual error. All told the present invention
provides a digital compression technique that is useful in the
transmission and reproduction of images, particularly those
found in high definition television applications.

Further, although the invention has been described relative
to a specific embodiment thereof, it is not so limited and
many modifications and variations thereof now will be
readily apparent for those skilled in the art in light of the
above teachings.

What I claim is:

1. A method for transforming a digital color image into a
compressed representation of said image comprising the
steps of:

(a) transforming each color image pixel to three color
channel values corresponding to brightness and two
color signals,

(b) down-sampling each transformed color channel by a
predetermined factor,

(c) partitioning each color channel image into a set of
square blocks of pixels, each block having an index b,

(d) applying a Discrete Cosine Transform (DCT) to
transform said block of pixels into digital signals
represented by DCT transformation coefficients
($c_{u,v,b,\theta}$),

(e) selecting luminance-adjusted thresholds for each
block of pixels based on parameters comprising said
DCT transformation coefficients $c_{u,v,b,\theta}$, and display
and perceptual parameters;

(f) selecting a DCT mask ($m_{u,v,b,\theta}$) for each block of
pixels based on parameters comprising said DCT transformation
coefficients $c_{u,v,b,\theta}$ and luminance-adjusted
thresholds, and said display and perceptual parameters;

(g) selecting a quantization matrix ($q_{u,v,\theta}$) comprising the
steps of:

(i) selecting an initial value of $q_{u,v,\theta}$;

(ii) quantizing the DCT coefficient $c_{u,v,b,\theta}$ in each block
b to form quantized coefficient $k_{u,v,b,\theta}$;

(iii) de-quantizing the quantized coefficients by multiplying
them by $q_{u,v,\theta}$;

(iv) subtracting the de-quantized coefficient
$q_{u,v,\theta}k_{u,v,b,\theta}$ from $c_{u,v,b,\theta}$ to compute the quantization
error $e_{u,v,b,\theta}$;

(v) dividing $e_{u,v,b,\theta}$ by the DCT mask $m_{u,v,b,\theta}$ to obtain
perceptual errors $j_{u,v,b,\theta}$;

(vi) pooling the perceptual errors of one frequency u,v
and color channel $\theta$ over all blocks b to obtain an
entry in the perceptual error matrices $p_{u,v,\theta}$;

(vii) repeating steps i-vi for each frequency u,v,$\theta$;

(viii) adjusting the values $q_{u,v,\theta}$ up or down until each
entry in the perceptual error matrix $p_{u,v,\theta}$ is within a
target range;

(h) entropy coding said quantization matrices and the
quantized coefficients of said image.

2. The method of transforming an image according to
claim 1 further comprising the steps:

(a) transmitting entropy coded image to a user of said
digital image;

(b) decoding said entropy coded image;

(c) decoding said quantization matrix to derive said DCT
transformation coefficients;

(d) applying an inverse Discrete Cosine Transform to
derive said block of pixels;

(e) up-sampling each color channel image; and

(f) inverse color transforming each color pixel to recover
the reconstructed color image.

3. The method of transforming an image according to
claim 1, wherein luminance-adjusted thresholds are determined
by the following expression:

$$a_{u,v,b,\theta} = t_{u,v,\theta} \left( \frac{r_Y + c_{0,0,b,Y} / \bar{c}_{0,0,Y}}{1 + r_Y} \right)^{a_t}$$

where $a_{u,v,b,\theta}$ are the luminance-adjusted thresholds, $r_Y$ is a
veiling luminance quantity, $a_t$ is a luminance-masking exponent
having a value of about 0.65, $t_{u,v,\theta}$ are the un-adjusted
thresholds, $\bar{c}_{0,0,Y}$ is an average of the DC terms of the DCT
coefficients of the image, and $c_{0,0,b,Y}$ is the DC
coefficient of block b of a brightness (Y) channel.

4. The method of transforming an image according to
claim 1, wherein said DCT mask is determined by the
following expression:

$$m_{u,v,b,\theta} = a_{u,v,b,\theta} \, \mathrm{Max}\left[ 1, \left| \frac{c_{u,v,b,\theta}}{a_{u,v,b,\theta}} \right|^{w_{u,v,\theta}} \right]$$

where $c_{u,v,b,\theta}$ is the DCT coefficient, $a_{u,v,b,\theta}$ is a corresponding
block luminance-adjusted threshold, and $w_{u,v,\theta}$ is an
exponent that lies between 0 and 1 and is typically about 0.7,
and where $w_{0,0,\theta}$ = 0.

5. The method of transforming an image according to
claim 1, wherein said pooling of perceptual errors is determined
by the following expression:

$$p_{u,v,\theta} = \left( \sum_{b} \left| j_{u,v,b,\theta} \right|^{\beta} \right)^{1/\beta}$$

where $j_{u,v,b,\theta}$ is a perceptual error in a particular frequency
u,v, color $\theta$ and block b and $\beta$ is a pooling exponent having
a typical value of about 4.

6. A computer involved in converting a digital color image
into a compressed representation of said image, said computer
comprising:

(a) means for transforming each color pixel into three
color channel values corresponding to brightness and
two color channels;

(b) means for down-sampling each color channel;

(c) means for partitioning each color channel into a set of
square blocks of pixels, each block having an index b,

(d) means for applying a Discrete Cosine Transform
(DCT) to transform said blocks of pixels into digital
signals represented by DCT transformation coefficients
($c_{u,v,b,\theta}$);

(e) means for selecting a DCT mask ($m_{u,v,b,\theta}$) for each
block of pixels based on parameters comprising said
DCT transformation coefficients $c_{u,v,b,\theta}$ and display and
perceptual parameters;

(f) means for selecting a quantization matrix comprising:

(i) means for selecting an initial value of $q_{u,v,\theta}$;

(ii) means for quantizing the DCT coefficient $c_{u,v,b,\theta}$ in
each block b to form quantized coefficient $k_{u,v,b,\theta}$;
(iii) means for de-quantizing the quantized coefficients
by multiplying them by $q_{u,v,\theta}$;

(iv) means for subtracting the de-quantized coefficient
$q_{u,v,\theta}k_{u,v,b,\theta}$ from $c_{u,v,b,\theta}$ to compute a quantization
error $e_{u,v,b,\theta}$;

(v) means for dividing $e_{u,v,b,\theta}$ by the DCT mask $m_{u,v,b,\theta}$
to obtain perceptual errors;

(vi) means for pooling the perceptual errors of one
frequency u,v and color $\theta$ over all blocks b to obtain
an entry in the perceptual error matrix $p_{u,v,\theta}$;

(vii) means for adjusting the values $q_{u,v,\theta}$ up or down
until each entry in the perceptual error matrix is
within a target range;

(g) means for entropy coding said quantization matrix and
the quantized coefficients of said image.

7. The computer involved in converting an image according
to claim 6 further comprising:

(a) means for transmitting entropy coded image to a user
of said digital image;

(b) means for decoding said entropy coded image;

(c) means for decoding said quantization matrix to derive
said DCT transform coefficients;

(d) means for applying an inverse Discrete Cosine Transform
to derive said block of pixels;

(e) means for reassembling said blocks to recover the
color channels;

(f) means for up-sampling said color channels to recover
full resolution color channels; and

(g) means for inverse color transforming said color channels
to recover the reconstructed image.

8. The computer involved in converting an image according
to claim 6, wherein luminance-adjusted thresholds are
determined by the following expression:

$$a_{u,v,b,\theta} = t_{u,v,\theta} \left( \frac{r_Y + c_{0,0,b,Y} / \bar{c}_{0,0,Y}}{1 + r_Y} \right)^{a_t}$$

where $a_{u,v,b,\theta}$ are luminance-adjusted thresholds, $r_Y$ is a
veiling luminance quantity, $a_t$ is a luminance-masking exponent
having a value of about 0.65, $t_{u,v,\theta}$ are the un-adjusted
thresholds, $\bar{c}_{0,0,Y}$ is an average of the DC terms of the DCT
coefficients of the image, and $c_{0,0,b,Y}$ is the DC
coefficient of block b of a brightness (Y) channel.

9. The computer involved in converting an image according
to claim 6, wherein said DCT mask is determined by the
following expression:

$$m_{u,v,b,\theta} = a_{u,v,b,\theta} \, \mathrm{Max}\left[ 1, \left| \frac{c_{u,v,b,\theta}}{a_{u,v,b,\theta}} \right|^{w_{u,v,\theta}} \right]$$

where $c_{u,v,b,\theta}$ is the DCT coefficient, $a_{u,v,b,\theta}$ is a corresponding
block luminance-adjusted threshold, and $w_{u,v,\theta}$ is an
exponent that lies between 0 and 1 and is typically about 0.7,
and where $w_{0,0,\theta}$ = 0.

10. The computer involved in converting an image
according to claim 6, wherein said pooling of perceptual
errors is determined by the following expression:

$$p_{u,v,\theta} = \left( \sum_{b} \left| j_{u,v,b,\theta} \right|^{\beta} \right)^{1/\beta}$$

where $j_{u,v,b,\theta}$ is a perceptual error in a particular frequency
u,v, color $\theta$ and block b and $\beta$ is a pooling exponent having
a typical value of about 4.

* * * * *
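
The luminance-adjusted threshold and DCT mask expressions recited in the claims above can be illustrated with a short Python sketch. The form of the threshold expression used here is reconstructed from garbled text and should be read as an assumption; the function and argument names are hypothetical. The default values (veiling luminance ratio 0.05, luminance-masking exponent 0.65, contrast-masking exponent 0.7) are the typical values given in the description.

```python
def luminance_adjusted_threshold(t, c00_block, c00_mean, r_y=0.05, a_t=0.65):
    """Threshold t adjusted for the mean luminance of one block.

    c00_block: DC coefficient of the block in the brightness (Y) channel.
    c00_mean:  average DC coefficient over the image.
    r_y:       veiling luminance quantity; a_t: luminance-masking exponent.
    Assumed form: t * ((r_y + c00_block / c00_mean) / (1 + r_y)) ** a_t
    """
    return t * ((r_y + c00_block / c00_mean) / (1.0 + r_y)) ** a_t


def dct_mask(c, a, w=0.7):
    """DCT mask for coefficient c and luminance-adjusted threshold a.

    For the DC term the exponent w is taken as 0, so the mask equals a.
    """
    return a * max(1.0, abs(c / a) ** w)


# Mask growth for the FIG. 6 example values (w = 0.7, a = 2).
growth = dct_mask(10.0, 2.0) / dct_mask(2.0, 2.0)
```

With these values the mask grows by a factor of 5 to the power 0.7, a little over three, as the coefficient rises from 2 to 10, which agrees with the behavior described for FIG. 6. Setting `a_t` to 0 returns the un-adjusted threshold, which corresponds to suppressing luminance masking.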
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